feat!: lbug-only graph backend; rip DuckDB graph adapter #117
Merged
DuckDB no longer implements IGraphStore. lbug (`@ladybugdb/core`) is the
sole graph backend; DuckDB stays as the temporal-only sidecar
(cochanges, symbol summaries, sql escape hatch, embeddings staging for
the deterministic Parquet sidecar). The auto-probe / dual-artifact /
CODEHUB_STORE resolver, the stale-artifact detection, and ~1900 LOC of
DuckDB graph-tier code are all gone.
Storage shape after the rip:
- `openStore({path})` always returns `{graph: GraphDbStore, temporal:
DuckDbStore, graphFile, temporalFile, close}`. No `backend` field on
the result, no `backend?` option on input.
- `<repo>/.codehub/graph.lbug` + `<repo>/.codehub/temporal.duckdb`.
`paths.describeArtifacts()` takes no arguments. `resolveDbPath` is
renamed `resolveGraphPath` and returns the lbug filename.
- The `IGraphStore` / `ITemporalStore` segregation stays — that's the
contract community AGE/Memgraph/Neo4j/Neptune adapters target. The
v1.0 conformance suite stays for the same reason.
- Embeddings live in graph.lbug; the pack sidecar streams them through
a per-call DuckDB temp table on temporal.duckdb so the byte-identical
Parquet writer still works.
- MCP `sql` tool's `cypher` field becomes unconditionally available.
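The storage shape above can be sketched as types. This is a minimal illustration, not the package's code: `OpenedStore`, `artifactPaths`, `openStoreSketch`, and the `role` fields are hypothetical names, the real `IGraphStore` / `ITemporalStore` contracts carry methods this thread does not spell out, and whether the real `close` is synchronous is an assumption.

```typescript
// Hypothetical stand-ins for the two store contracts. The real interfaces
// expose query methods that are not described in this PR thread.
interface IGraphStore {
  readonly role: "graph";
}
interface ITemporalStore {
  readonly role: "temporal";
}

// Result shape as described above: both stores are always present and
// there is no `backend` discriminator on the result.
interface OpenedStore {
  graph: IGraphStore;
  temporal: ITemporalStore;
  graphFile: string;
  temporalFile: string;
  close: () => void; // assumption: the real signature may be async
}

// Hypothetical helper: derives both artifact paths from a repo root.
function artifactPaths(repoRoot: string): { graphFile: string; temporalFile: string } {
  const dir = `${repoRoot.replace(/\/+$/, "")}/.codehub`;
  return {
    graphFile: `${dir}/graph.lbug`,
    temporalFile: `${dir}/temporal.duckdb`,
  };
}

// Stub opener that only demonstrates the shape; it opens no database.
function openStoreSketch(options: { path: string }): OpenedStore {
  const { graphFile, temporalFile } = artifactPaths(options.path);
  return {
    graph: { role: "graph" },
    temporal: { role: "temporal" },
    graphFile,
    temporalFile,
    close: () => {},
  };
}
```

Because the result is no longer a union keyed on a backend, callers can destructure `graph` and `temporal` directly without branching.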
lbug operational fixes captured in the same change:
- Pool now passes explicit `maxDbBytes=16 GiB` and
`bufferManagerBytes=2 GiB` so concurrent test Databases don't exhaust
the 47-bit (128 TiB) user VA on Linux: the default `maxDBSize=1<<43`
reserves 8 TiB per Database at construction. Citations:
kuzudb/kuzu#1826, `BufferPoolConstants::DEFAULT_VM_REGION_MAX_SIZE`.
- Bulk-load STRING[] sentinel switched from `[]` to `["__sentinel__"]`
so lbug's struct-field type inference doesn't resolve to LIST(ANY).
Empty-array sentinels surface as "Trying to create a vector with ANY
type" the moment a data row supplies a string.
- `ensureFtsIndex` / `ensureVectorIndex` no-op in readOnly mode, and
bulkLoad runs them at the end of the write path so readers don't
trigger writes on lbug.
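The first fix in this list comes down to address-space arithmetic. The sketch below is a back-of-the-envelope upper bound only: it assumes each Database reserves exactly its maximum size and that nothing else is mapped in the process, and it ignores the buffer-manager reservation. `maxConcurrentDatabases` is a hypothetical helper, not part of lbug or the pool.

```typescript
const GiB = 2 ** 30;
const TiB = 2 ** 40;

// 47-bit user virtual address space on Linux: 2^47 bytes = 128 TiB.
const userVaBytes = 128 * TiB;

// Default reservation cited above: 2^43 bytes = 8 TiB per Database.
// (Written as a product because `1 << 43` wraps at 32 bits in JavaScript.)
const defaultMaxDbBytes = 8 * TiB;

// Explicit cap the pool now passes.
const explicitMaxDbBytes = 16 * GiB;

// Upper bound on simultaneously open Databases for a given reservation.
function maxConcurrentDatabases(vaBytes: number, perDbBytes: number): number {
  return Math.floor(vaBytes / perDbBytes);
}

const atDefault = maxConcurrentDatabases(userVaBytes, defaultMaxDbBytes);
const atExplicit = maxConcurrentDatabases(userVaBytes, explicitMaxDbBytes);

console.log(`default 8 TiB reservation: at most ${atDefault} Databases`);
console.log(`explicit 16 GiB reservation: at most ${atExplicit} Databases`);
```

Under these assumptions the default caps a process at 16 live Databases, a limit a parallel test suite can plausibly reach, while the explicit 16 GiB setting raises the bound to 8192.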
ADR 0016 records the rip-out and supersedes ADR 0013 entirely; ADR
0011's "DuckDB-default + LadybugDB opt-in" framing is partially
superseded.
Net diff: +1297 / -7391 (6094 net deletions across 60 files).
Workspace verdict: `mise run check` exit 0 — 1931 passing tests, 0
failing, 2 platform-skipped.
…n before delete
Two lbug-vs-DuckDB-graph behavioral gaps surfaced by self-scan against
the OCH repo:
1. lbug's COPY enforces that every relation's from/to is a real CodeNode
primary key. The pipeline's fetches phase emits synthetic targets
(e.g. `fetches:unresolved:GET:/users/1`) carrying the URL template in
`reason`; these intentionally have no node. DuckDB silently accepted
them; lbug rejects with `Copy exception: Unable to find primary key
value`. Synthesize a Route placeholder for every orphan edge target
before insertNodes; downstream tools recognise the well-known prefix.
2. lbug builds the FTS index against CodeNode; deleting from CodeNode
(truncateAll in replace mode, mergeNodes per-id in upsert mode)
without the FTS extension loaded surfaces `Binder exception: Trying
to delete from an index on table CodeNode but its extension is not
loaded`. ingest-sarif's bulkLoad(graph, {mode: "upsert"}) hits this
on every analyze run after the first. Load the extension at the top
of bulkLoad so both modes' deletes succeed; failures are swallowed
on platforms without FTS so the search-side codepath surfaces the
clearer error later.
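The first gap's fix (a placeholder node for every orphan edge target) could look roughly like the sketch below. The row shapes (`NodeRow`, `EdgeRow`), their field names, and `synthesizeOrphanTargets` are hypothetical; only the idea of emitting a `Route` placeholder per missing target before `insertNodes` comes from the description above.

```typescript
// Hypothetical row shapes; the pipeline's real rows carry more fields.
interface NodeRow {
  id: string;
  kind: string;
}
interface EdgeRow {
  from: string;
  to: string;
  type: string;
}

// Returns one placeholder node per distinct edge target that has no
// matching node id, so a primary-key check on `to` can succeed.
function synthesizeOrphanTargets(
  nodes: readonly NodeRow[],
  edges: readonly EdgeRow[],
): NodeRow[] {
  const known = new Set(nodes.map((n) => n.id));
  const placeholders: NodeRow[] = [];
  for (const edge of edges) {
    if (!known.has(edge.to)) {
      known.add(edge.to); // dedupe repeated orphan targets
      placeholders.push({ id: edge.to, kind: "Route" });
    }
  }
  return placeholders;
}
```

A caller would append the returned placeholders to the node batch before inserting nodes, so the later relation COPY finds every target id.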
Verified: `codehub analyze .` runs end-to-end on the OCH repo (18,893
findings ingested via SARIF). Full workspace tests still 1931 pass / 0
fail. mise run check exit 0.
theagenticguy
added a commit
that referenced
this pull request
May 16, 2026
🤖 Automated release via release-please

---

<details><summary>analysis: 0.3.0</summary>

## [0.3.0](analysis-v0.2.0...analysis-v0.3.0) (2026-05-16)

### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117)) ([49e14fd](49e14fd))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.0
    * @opencodehub/wiki bumped to 0.2.0
</details>

<details><summary>cli: 0.5.0</summary>

## [0.5.0](cli-v0.4.0...cli-v0.5.0) (2026-05-16)

### ⚠ BREAKING CHANGES

* drop detect-secrets; ship tuned betterleaks default config ([#118](#118))
* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117))

### Features

* drop detect-secrets; ship tuned betterleaks default config ([#118](#118)) ([d370f9e](d370f9e))
* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117)) ([49e14fd](49e14fd))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
    * @opencodehub/ingestion bumped to 0.4.1
    * @opencodehub/mcp bumped to 0.4.0
    * @opencodehub/pack bumped to 0.2.0
    * @opencodehub/scanners bumped to 0.2.0
    * @opencodehub/search bumped to 0.2.0
    * @opencodehub/storage bumped to 0.2.0
    * @opencodehub/wiki bumped to 0.2.0
</details>

<details><summary>cobol-proleap: 0.1.5</summary>

## [0.1.5](cobol-proleap-v0.1.4...cobol-proleap-v0.1.5) (2026-05-16)

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/ingestion bumped to 0.4.1
</details>

<details><summary>ingestion: 0.4.1</summary>

## [0.4.1](ingestion-v0.4.0...ingestion-v0.4.1) (2026-05-16)

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
    * @opencodehub/scip-ingest bumped to 0.2.2
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>mcp: 0.4.0</summary>

## [0.4.0](mcp-v0.3.2...mcp-v0.4.0) (2026-05-16)

### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117)) ([49e14fd](49e14fd))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
    * @opencodehub/pack bumped to 0.2.0
    * @opencodehub/scanners bumped to 0.2.0
    * @opencodehub/search bumped to 0.2.0
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>pack: 0.2.0</summary>

## [0.2.0](pack-v0.1.4...pack-v0.2.0) (2026-05-16)

### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117)) ([49e14fd](49e14fd))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
    * @opencodehub/ingestion bumped to 0.4.1
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>scanners: 0.2.0</summary>

## [0.2.0](scanners-v0.1.2...scanners-v0.2.0) (2026-05-16)

### ⚠ BREAKING CHANGES

* drop detect-secrets; ship tuned betterleaks default config ([#118](#118))

### Features

* drop detect-secrets; ship tuned betterleaks default config ([#118](#118)) ([d370f9e](d370f9e))
</details>

<details><summary>scip-ingest: 0.2.2</summary>

## [0.2.2](scip-ingest-v0.2.1...scip-ingest-v0.2.2) (2026-05-16)

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/analysis bumped to 0.3.0
</details>

<details><summary>search: 0.2.0</summary>

## [0.2.0](search-v0.1.2...search-v0.2.0) (2026-05-16)

### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117)) ([49e14fd](49e14fd))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>storage: 0.2.0</summary>

## [0.2.0](storage-v0.1.2...storage-v0.2.0) (2026-05-16)

### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117)) ([49e14fd](49e14fd))
</details>

<details><summary>wiki: 0.2.0</summary>

## [0.2.0](wiki-v0.1.1...wiki-v0.2.0) (2026-05-16)

### ⚠ BREAKING CHANGES

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117))

### Features

* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117)) ([49e14fd](49e14fd))

### Dependencies

* The following workspace dependencies were updated
  * dependencies
    * @opencodehub/storage bumped to 0.2.0
</details>

<details><summary>root: 0.6.0</summary>

## [0.6.0](root-v0.5.0...root-v0.6.0) (2026-05-16)

### ⚠ BREAKING CHANGES

* drop detect-secrets; ship tuned betterleaks default config ([#118](#118))
* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117))

### Features

* drop detect-secrets; ship tuned betterleaks default config ([#118](#118)) ([d370f9e](d370f9e))
* lbug-only graph backend; rip DuckDB graph adapter ([#117](#117)) ([49e14fd](49e14fd))

### Bug Fixes

* **ci:** grant id-token: write at release-please.yml top level ([#115](#115)) ([a87a6eb](a87a6eb))
* **ci:** install betterleaks via mise so the pre-release gate finds it ([#120](#120)) ([522a4ec](522a4ec))
* **ci:** pre-release gate aggregator needs betterleaks (was detect-secrets) ([#119](#119)) ([a6f3448](a6f3448))
</details>

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Laith Al-Saadoon <alsaadoonlaith@gmail.com>
theagenticguy
added a commit
that referenced
this pull request
May 29, 2026
…ge graph
`codehub sql` and the MCP `sql` tool's `sql` arg run against the DuckDB
temporal store (cochanges + symbol_summaries). The node/edge graph moved
to lbug in ADR 0016 and is reached via the typed tools or Cypher. Docs
across the repo still said "SQL against the graph store", and the
opencodehub-guide skill shipped a whole "Graph schema" + "SQL cheat-sheet"
of `SELECT FROM nodes/relations` queries that ERROR against the current
store (field-report Issue 4).
Reworded every user-facing site: CLAUDE.md, AGENTS.md,
packages/cli/README.md, the `sql` --help in cli/src/index.ts,
cli/src/agent-context.ts, docs reference/cli.md + tool-decision-matrix.mdx,
and both copies of the opencodehub-guide SKILL.md (.claude + plugins). The
SKILL cheat-sheet is rewritten to REAL lbug Cypher: single `:CodeNode`
label with `kind` as a property, snake_case props, per-type relationship
labels, plus a small temporal-SQL cochanges example. Also fixed two MCP
next-step hints (dependencies, list-findings) that told users to run a
relations SELECT — now a Cypher MATCH on the relationship label.
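To make the two-store split concrete, here is a small sketch of how the two query kinds differ. The `SqlToolArgs` union and `targetStore` helper are hypothetical illustrations of the MCP `sql` tool's `sql` and `cypher` fields. The query strings follow the shape described in this change (`:CodeNode` label, `file_path` property, `CALLS` relationship label, `cochanges` table) but have not been run against a live store.

```typescript
// Hypothetical argument shape: the tool's real schema may differ.
type SqlToolArgs = { sql: string } | { cypher: string };

// Graph question: goes to lbug as Cypher. One :CodeNode label for every
// node, snake_case properties, one relationship label per edge type.
const callEdges: SqlToolArgs = {
  cypher:
    "MATCH (a:CodeNode)-[r:CALLS]->(b:CodeNode) " +
    "RETURN a.file_path, b.file_path LIMIT 10",
};

// Temporal question: goes to DuckDB as SQL.
const cochangeRows: SqlToolArgs = {
  sql: "SELECT * FROM cochanges LIMIT 10",
};

// Which store a given request would be answered from.
function targetStore(args: SqlToolArgs): "lbug graph" | "DuckDB temporal" {
  return "cypher" in args ? "lbug graph" : "DuckDB temporal";
}

console.log(targetStore(callEdges));    // lbug graph
console.log(targetStore(cochangeRows)); // DuckDB temporal
```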
No production query behavior changes — guidance and strings only.
Typecheck and biome clean. A pre-existing, unrelated mcp unit test
("impact surfaces cochanges") fails on clean main from cochange-harness
drift since the #117 DuckDB-graph rip; not in any path this PR touches and
not a required check. Tracked separately.
theagenticguy
added a commit
that referenced
this pull request
May 29, 2026
…ge graph (#173)

## Summary

`codehub sql` and the MCP `sql` tool's `sql` arg run against the **DuckDB temporal store** (`cochanges` + `symbol_summaries`). The node/edge graph moved to lbug in **ADR 0016** and is reachable only via the typed tools or **Cypher**. Docs across the repo still said "SQL against the graph store" (field-report Issue 4), and the `opencodehub-guide` skill shipped a whole "Graph schema" + "SQL cheat-sheet" of `SELECT … FROM nodes/relations` queries that **error** against the current store.

## Changes (guidance/strings only; no query behavior change)

- Reworded every user-facing site: `CLAUDE.md`, `AGENTS.md`, `packages/cli/README.md`, the `sql` `--help` in `cli/src/index.ts`, `cli/src/agent-context.ts`, `docs/reference/cli.md`, `docs/agents/tool-decision-matrix.mdx`.
- **Rewrote the `opencodehub-guide` SKILL.md cheat-sheet** (both copies: `.claude/skills` + `plugins/`) to **real lbug Cypher**: single `:CodeNode` label with `kind` as a property, snake_case props (`file_path`, `start_line`, `step_count`, …), per-type relationship labels (`[r:CALLS]`), plus a small temporal-SQL `cochanges` example. The framing now states the two-store split explicitly.
- Fixed two **MCP next-step hints** (`dependencies`, `list-findings`) that told users to run `SELECT * FROM relations …` → now `MATCH ()-[r:…]->() RETURN r` Cypher.

## Test plan

- [x] `grep` confirms zero `FROM relations` / `FROM nodes` / "DuckDB-backed graph" guidance remains in live docs.
- [x] Both SKILL.md copies are byte-identical and contain no broken SQL.
- [x] `tsc --noEmit` + `biome` clean on the touched code files.

## Heads-up (not caused by this PR)

A pre-existing mcp unit test, `impact surfaces cochanges` (`tool-handlers.test.js`), fails on **clean main**: the cochange test-harness wiring drifted since #117 (the DuckDB-graph rip). It's not in any path this PR touches and is **not** a branch-protection-required check (the mcp unit suite is the non-required `test` job). Tracked separately for a focused harness fix.
Summary
- DuckDB no longer implements `IGraphStore`. lbug (`@ladybugdb/core`) is the sole graph backend; DuckDB stays as the temporal-only sidecar (cochanges, symbol summaries, sql escape hatch, deterministic embeddings Parquet via temp-table staging).
- The `CODEHUB_STORE` resolver and ~1900 LOC of DuckDB graph-tier code are gone.
- `openStore({path})` always returns `{graph, temporal, graphFile, temporalFile, close}` over `graph.lbug` + `temporal.duckdb`.
- The `IGraphStore` / `ITemporalStore` segregation is preserved as the v1.0 community-adapter contract (AGE / Memgraph / Neo4j / Neptune still target `IGraphStore`).

Operational fixes captured in the same change
- `maxDBSize` virtual-address exhaustion: the default `1 << 43` = 8 TiB per `Database` is reserved at construction. Pool now passes explicit `maxDbBytes=16 GiB` + `bufferManagerBytes=2 GiB` (citations: "Buffer manager exception" kuzudb/kuzu#1826, `BufferPoolConstants::DEFAULT_VM_REGION_MAX_SIZE`).
- `LIST(ANY)` runtime trap: the sentinel now seeds `["__sentinel__"]`; `WITH r WHERE r.id <> SENTINEL` filters before COPY.
- `CALL CREATE_FTS_INDEX` / `CALL CREATE_VECTOR_INDEX` on a readOnly Database surfaced as "Cannot execute write operations in a read-only database!" the moment a reader called `search()`. Fix: build both at end of `bulkLoad`; readOnly opens skip index creation.

Net diff
`+1545 / -7391` across 60 files (5,800 net deletions).

Test plan
- `mise run check` exit 0 (lint + typecheck + banned-strings + build + test)
- `@ladybugdb/core` binding installs on your platform; Alpine/musl users need cmake-js source build (Wave 4 doctor flips to fail-hard)

🤖 Generated with Claude Code